Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games Using Baselines

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Monte Carlo Sampling for Regret Minimization in Extensive Games

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...

متن کامل

Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization

Online search in games has always been a core interest of artificial intelligence. Advances made in search for perfect information games (such as Chess, Checkers, Go, and Backgammon) have led to AI capable of defeating the world’s top human experts. Search in imperfect information games (such as Poker, Bridge, and Skat) is significantly more challenging due to the complexities introduced by hid...

متن کامل

Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games

Online search in games has been a core interest of artificial intelligence. Search in imperfect information games (e.g., Poker, Bridge, Skat) is particularly challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling, an online search variant of Monte Carlo Counterfactual Regret Minimization, which preserves its convergence to Nash eq...

متن کامل

Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization

Recently, there has been considerable progress towards algorithms for approximating Nash equilibrium strategies in extensive games. One such algorithm, Counterfactual Regret Minimization (CFR), has proven to be effective in two-player zero-sum poker domains. While the basic algorithm is iterative and performs a full game traversal on each iteration, sampling based approaches are possible. For i...

متن کامل

Supplemental Material for Monte Carlo Sampling for Regret Minimization in Extensive Games

The supplementary material presented here first presents a detailed description of the MCCFR algorithm. We then give proofs to Theorems 3, 4, and 5 from the submission Monte Carlo Sampling for Regret Minimization in Extensive Games. We begin with some preliminaries, then prove a general result about all members of the MCCFR family of algorithms (Theorem 18 in Section 6). We then use that result...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Proceedings of the AAAI Conference on Artificial Intelligence

سال: 2019

ISSN: 2374-3468,2159-5399

DOI: 10.1609/aaai.v33i01.33012157